Effects of allophones on the performance of Korean speech recognition
Abstract
This paper investigates the effects of allophones on the performance of Korean speech recognition systems. Along with a baseline phone-like unit (PLU) set consisting of phonemes, 31 allophone-based PLU sets are designed by systematically considering five major Korean allophonic constraints, which together can describe all the PLU sets currently used in Korean speech recognition systems. Experiments on phone, word, and continuous speech recognition are performed using the proposed PLU sets. The results show that the allophone-based PLU sets improve recognition performance over the baseline phoneme-based PLU set. The improvement is clearly evident in phone recognition for isolated speech and in isolated word and continuous speech recognition using context-independent units. As predicted, the improvement is less evident when context-dependent (CD) units are used, since the allophonic information is already internalized in the CD units. Finally, the VOICING-LAX constraint is observed to play a positive role, whereas the other constraints are only partly influential.
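The figure of 31 allophone-based PLU sets is consistent with taking every non-empty combination of the five constraints (2^5 - 1 = 31), with the empty combination corresponding to the phoneme baseline. The abstract does not state this explicitly, so the following Python sketch is an illustration under that assumption, not the authors' procedure. Only VOICING-LAX is named in the abstract; the other four constraint names are hypothetical placeholders.

```python
from itertools import combinations

# Only VOICING-LAX is named in the abstract. The remaining four names are
# hypothetical placeholders standing in for the other constraints.
CONSTRAINTS = [
    "VOICING-LAX",
    "CONSTRAINT-B",
    "CONSTRAINT-C",
    "CONSTRAINT-D",
    "CONSTRAINT-E",
]


def enumerate_constraint_subsets(constraints):
    """Return every non-empty subset of the constraints, smallest first."""
    subsets = []
    for size in range(1, len(constraints) + 1):
        subsets.extend(combinations(constraints, size))
    return subsets


# Assumption: each non-empty subset defines one allophone-based PLU set,
# and the empty subset is the phoneme-only baseline.
allophone_plu_configs = enumerate_constraint_subsets(CONSTRAINTS)
with_voicing_lax = [c for c in allophone_plu_configs if "VOICING-LAX" in c]

print(len(allophone_plu_configs))  # 31
print(len(with_voicing_lax))       # 16
```

Under this reading, 16 of the 31 configurations include VOICING-LAX and 15 do not, which is the kind of split that would let the effect of a single constraint be compared across otherwise matched PLU sets.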
Similar resources
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighboring phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Synthesized Fricative ch Specific Features and Influence on Speech Quality Analysis
One of the main problems in speech synthesis is the synthesis of unvoiced fricatives. One of our previously stated conclusions is that the consonant x is influenced by the preceding and following phonetic elements. The aim of the experiments described in this paper is to evaluate the influence of different x allophones on speech intelligibility and automatic speech recognition. In this paper the formal system, which ...
Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in emotional speech recognition in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Monolingual and Bilingual Spanish-Catalan Speech Recognizers Developed from SpeechDat Databases
Under the SpeechDat specifications, the Spanish member of SpeechDat consortium has recorded a Catalan database that includes one thousand speakers. This communication describes some experimental work that has been carried out using both the Spanish and the Catalan speech material. A speech recognition system has been trained for the Spanish language using a selection of the phonetically balance...